Most of the Victorian population is concentrated in the Melbourne city region. Other regions, though large in area, have smaller populations.
| SA4_CODE_2016 | femalepopulation | malepopulation | population |
|---|---|---|---|
| 201 | 32726 | 34691 | 67417 |
| 202 | 32396 | 34054 | 66450 |
| 203 | 60660 | 64307 | 124967 |
| 204 | 35934 | 39614 | 75548 |
| 205 | 52929 | 57572 | 110501 |
| 206 | 159362 | 160819 | 320181 |
| 207 | 81814 | 86786 | 168600 |
| 208 | 96482 | 101671 | 198153 |
| 209 | 109370 | 122195 | 231565 |
| 210 | 71224 | 85167 | 156391 |
| 211 | 118179 | 129501 | 247680 |
| 212 | 151481 | 184164 | 335645 |
| 213 | 147830 | 178340 | 326170 |
| 214 | 62731 | 68190 | 130921 |
| 215 | 29867 | 33492 | 63359 |
| 216 | 25915 | 28796 | 54711 |
| 217 | 26236 | 29297 | 55533 |
| 297 | 0 | 9 | 9 |
| 299 | 765 | 1229 | 1994 |
Health care professionals form the largest group, and the ratio of men to women among them is less than one.
In construction, by contrast, more men than women are employed as labourers.
The number of women in the education sector far exceeds that of men.
Management & Commerce is the most commonly studied field.
More men than women have studied Engineering and Technology. However, more people are employed in Health Care than in industries related to Engineering.
More women have studied Management and Commerce, yet more men are employed as managers.
Most of the Victorian population is educated up to level 7, and most are employed as professionals.
However, a large number of people are employed as labourers even though the share of the population with less than a high school education is very small.
GenderLinearModel shows the relationship between male and female populations.
Most residents achieved level 7, which corresponds to a bachelor degree, and almost twice as many women as men reached it.
The majority of male residents achieved levels 3 and 4.
| afq_level | age_min | population |
|---|---|---|
| Level 1 & 2 | 15 | 9402 |
| Level 3 & 4 | 25 | 146297 |
| Level 5 & 6 | 25 | 96920 |
| Level 7 | 25 | 245613 |
| Level 9 | 25 | 83204 |
| Not Stated | 25 | 70455 |
| Level 8 | 35 | 28908 |
| industry | age_min | population |
|---|---|---|
| Accommodation_and_food_services | 25 | 42103 |
| Administrative_and_support_services | 25 | 23086 |
| Arts_and_recreation_services | 25 | 13149 |
| Construction | 25 | 61959 |
| Electricity_gas_water_and_waste_service | 25 | 8039 |
| Financial_and_insurance_services | 25 | 32021 |
| Health_care_and_social_assistance | 25 | 80994 |
| Information_media_and_telecommunications | 25 | 14702 |
| Not Stated | 25 | 29901 |
| Other_services | 25 | 24089 |
| Professional_scientific_and_technical_services | 25 | 64125 |
| Rental_hiring_and_real_estate_services | 25 | 11796 |
| Retail_trade | 25 | 61803 |
| Mining | 35 | 2441 |
| Wholesale_trade | 35 | 22199 |
| Education_and_training | 45 | 56125 |
| Manufacturing | 45 | 55206 |
| Public_administration_and_safety | 45 | 37747 |
| Transport_postal_and_warehousing | 45 | 32663 |
| Agriculture_forestry_and_fishing | 55 | 12733 |
| field | age_min | population |
|---|---|---|
| Mixed_Field_Programmes | 15 | 1813 |
| Architecture_and_Building | 25 | 42510 |
| Creative_Arts | 25 | 40334 |
| Food_Hospitality_and_Personal_Services | 25 | 42938 |
| Health | 25 | 67630 |
| Information_Technology | 25 | 37535 |
| Management_and_Commerce | 25 | 150571 |
| Natural_and_Physical_Sciences | 25 | 22171 |
| Not Stated | 25 | 71440 |
| Society_and_Culture | 25 | 80932 |
| Agriculture_Environment | 35 | 13016 |
| Engineering_and_Technologies | 45 | 77524 |
| Education | 55 | 44696 |
| NA | NA | 896 |
| occupation | age_min | population |
|---|---|---|
| Community_and_personal_service_workers | 25 | 67104 |
| Not Stated | 25 | 11075 |
| Professionals | 25 | 190449 |
| Sales_workers | 25 | 51772 |
| Technicians_and_trades_workers | 25 | 99110 |
| Managers | 35 | 100601 |
| Clerical_and_administrative_workers | 45 | 89021 |
| Labourers | 45 | 49653 |
| Machinery_operators_and_drivers | 45 | 40922 |
The bar plots show the working population of each SA4 region by education level, field of study, industry of employment and occupation.
Region 206 had the largest number of people with the highest education levels, which is consistent with region 206 also having the largest number of people employed as professionals in their respective industries.
Management and commerce and engineering and technology were the most common fields of study, while agriculture and environment and mixed field programmes had the smallest population shares.
Health care, manufacturing and retail trade were the industries employing the most people, while Professionals and Managers were the most common occupations.
Best education level of each region
Best field of each region
The maps show the SA4 regions and the distribution of population by education level, industry, field of study and occupation respectively.
Most of the population has completed education level 7, with management and commerce as the most common field of study.
The highest numbers of people are employed as Professionals, Managers, and Technicians and trades workers.
The major industry in the city regions is health care, while the country regions are more active in agriculture.
Spatial Education Level Distribution
Spatial Industry Distribution
Spatial Study Field Distribution
Spatial Occupation Distribution
Conclusion
The education levels, fields of study, industries of employment and occupations of the Victorian population were studied at the SA4 level, with distributions broken down by age and sex. The tables and plots were compared to identify covariation between the population distributions, for example the population trend between field of study and industry of employment. Networks were drawn based on the population weights to analyze these trends. Some trends were particularly interesting, such as more men being employed as managers even though more women had studied management. Choropleth maps were made to analyze these trends spatially.
The goal of this report is to create a data story from these statistical summaries, enumerating the facts in the data and linking them to the real world. The data provided by the Australian Bureau of Statistics is aggregated open data and in no way identifies individuals who participated in the census. The ABS aims to integrate the census data with other datasets to make it more interesting. We aim to do the same and to bring out further data stories as we continue building this report.
Australian Bureau of Statistics (2016) ‘Census GeoPackages’, GeoPackages, accessed May 2021.
Australian Bureau of Statistics (2016) ‘Census DataPacks’, Census DataPacks, accessed May 2021.
Australian Bureau of Statistics (2016) Australian Statistical Geography Standard (ASGS).
---
title: "ETC5513 Assignment4 -Team StarWars"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    navbar:
      - { title: "About", href: "https://github.com/mohammedfaizan0014/etc5513-assignment-4-star-wars/blob/main/README.md", align: left }
    social: [ "twitter", "facebook", "menu" ]
    source_code: embed
---
```{r echo=FALSE, include=FALSE}
# The knitr chunk option is `message` (singular), not `messages`
knitr::opts_chunk$set(fig.path = "Figures/", fig.align = "center",
                      out.width = "50%", echo = FALSE,
                      message = FALSE,
                      warning = FALSE)
# Loading Libraries
library(tidyverse)
library(readr)
library(kableExtra)
library(tinytex)
library(bookdown)
library(naniar)
library(visdat)
library(citation)
library(knitr)
library(scales)
library(patchwork)
library(sf)
library(glue)
library(unglue)
library(sugarbag)
library(readxl)
library(plotly)
library(tidytext)
library(ggplot2)
library(igraph)
library(ggraph)
```
```{r}
data_path <- here::here("data/australian_census_data_2016/")
census_paths <- glue::glue(data_path, "/2016 Census GCP All Geographies for VIC/SA4/VIC/2016Census_G{number}{alpha}_VIC_SA4.csv",
number = c("46","46","47","47","47","51","51","51","51","57","57", "52", "52", "52", "52", "58", "58"), alpha = c("A","B","A","B","C","A","B","C","D","A","B", "A","B","C","D", "A","B"))
```
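
The path construction above depends on the two argument vectors being paired element by element, so that each table number meets its matching file letter. A minimal sketch of that pairing, using base R's `sprintf()` as a dependency-free stand-in for `glue()` (the three-file subset and the `file_names` variable are illustrative only):

```r
# Illustrative subset of the table numbers and file letters used above.
number <- c("46", "46", "47")
alpha  <- c("A", "B", "A")

# sprintf() is vectorised over its arguments: element i of `number`
# is combined with element i of `alpha`.
file_names <- sprintf("2016Census_G%s%s_VIC_SA4.csv", number, alpha)

print(file_names)
```

Because the pairing is positional, the two vectors must have the same length and order; a missing letter would shift every later file name.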
```{r geopath}
geopath <- glue::glue(data_path, "/2016_SA4_shape/SA4_2016_AUST.shp")
sa4_codes<- read_csv(census_paths[2]) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
select(SA4_CODE_2016)
sa4_geomap <- read_sf(geopath) %>%
right_join(sa4_codes, by=c("SA4_CODE16" = "SA4_CODE_2016"))
```
```{r g46read}
g46a<- read_csv(census_paths[1]) %>%
select(-starts_with("P"), -contains("Tot"), -contains("nfd"), -contains("IDes")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}_{educationlevel=GradDip_and_GradCert}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=PGrad_Deg}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=BachDeg}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=AdvDip_and_Dip}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=Cert_III_IV}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=Cert_I_II}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=Lev_Edu_NS}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{educationlevel=Lev_Edu_NS|GradDip_and_GradCert|PGrad_Deg|BachDeg|AdvDip_and_Dip|Cert_III_IV|Cert_I_II}_{age_min=\\d+}ov"
),
remove = FALSE) %>%
select(-category)
```
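
The `unglue_unnest()` patterns above split each census column name into sex, education level and age bounds. The sketch below illustrates the same idea with base R string functions only; it is not the `unglue` implementation, and the three column names are hypothetical examples written in the style the patterns expect.

```r
# Hypothetical column names in the style matched by the patterns above.
category <- c("M_BachDeg_25_34", "F_Cert_III_IV_45_54", "F_PGrad_Deg_85ov")

# The age part is the trailing "<min>_<max>" or "<min>ov" suffix.
age_part <- regmatches(category, regexpr("\\d+(_\\d+|ov)$", category))

sex     <- substr(category, 1, 1)
age_min <- as.integer(sub("^(\\d+).*$", "\\1", age_part))
# Open-ended bands ("85ov") have no upper bound.
age_max <- as.integer(ifelse(grepl("ov$", age_part), NA, sub("^\\d+_", "", age_part)))

# Whatever sits between the sex prefix and the age suffix is the level.
educationlevel <- substr(category, 3, nchar(category) - nchar(age_part) - 1)

parsed <- data.frame(sex, educationlevel, age_min, age_max)
print(parsed)
```

Anchoring the age suffix at the end of the name is what lets level codes that themselves contain underscores, such as `Cert_III_IV`, survive intact.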
```{r}
g46a <- g46a %>%
mutate(afq_level =case_when(str_detect(educationlevel, "GradDip_and_GradCert") ~ "Level 8",
str_detect(educationlevel, "PGrad") ~ "Level 9",
str_detect(educationlevel, "BachDeg") ~ "Level 7",
str_detect(educationlevel, "AdvDip_and_Dip") ~ "Level 5 & 6",
str_detect(educationlevel, "Cert_III_IV") ~ "Level 3 & 4",
str_detect(educationlevel, "Cert_I_II") ~ "Level 1 & 2",
str_detect(educationlevel, "Cert_Levl_nfd") ~ "Level 3 & 4",
str_detect(educationlevel, "Lev_Edu_IDes") ~ "Level Inadequately Described",
str_detect(educationlevel, "Lev_Edu_NS") ~ "Not Stated",
TRUE ~ educationlevel)) %>%
rename(count_edu_lvl = count)
```
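
The recoding above maps each census education code to a level label. As an alternative sketch (not the approach used in this dashboard), the same mapping can be written as a named lookup vector, which keeps the code-to-label pairs in one place. The `afq_lookup` name and the sample input are assumptions made for illustration.

```r
# Lookup from census education codes to the level labels used above.
afq_lookup <- c(
  PGrad_Deg            = "Level 9",
  GradDip_and_GradCert = "Level 8",
  BachDeg              = "Level 7",
  AdvDip_and_Dip       = "Level 5 & 6",
  Cert_III_IV          = "Level 3 & 4",
  Cert_I_II            = "Level 1 & 2",
  Lev_Edu_NS           = "Not Stated"
)

# Sample input for illustration only.
educationlevel <- c("BachDeg", "Cert_III_IV", "PGrad_Deg", "Lev_Edu_NS")

# Indexing by name performs an exact match on each code.
afq_level <- unname(afq_lookup[educationlevel])
print(afq_level)
```

Unlike `str_detect()`, name indexing matches whole codes only, so an unknown code yields `NA` rather than falling through to a default.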
```{r}
g47 <- map_dfr(census_paths[3:4], ~{
df <- read_csv(.x) %>%
select(-starts_with("P"), -contains("Tot"), -contains("InadDes")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}_{field=(Mgnt_Com|Society_Cult|Fd_Hosp_Psnl_Svcs|MixFld_Prgm|FldStd_NS|NatPhyl_Scn|InfoTech|Eng_RelTec|ArchtBldng|Ag_Envir_Rltd_Sts|Health|Educ|Creative_Arts)}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{field=(Mgnt_Com|Society_Cult|Fd_Hosp_Psnl_Svcs|MixFld_Prgm|FldStd_NS|NatPhyl_Scn|InfoTech|Eng_RelTec|ArchtBldng|Ag_Envir_Rltd_Sts|Health|Educ|Creative_Arts)}_{age_min=\\d+}ov",
"{sex=[MF]}_{field=(Mgnt_Com|Society_Cult|Fd_Hosp_Psnl_Svcs|MixFld_Prgm|FldStd_NS|NatPhyl_Scn|InfoTech|Eng_RelTec|ArchtBldng|Ag_Envir_Rltd_Sts|Health|Educ|Creative_Arts)}_{age_min=\\d+}_years_and_over"
),
remove = FALSE)
})
```
```{r}
g47 <- g47 %>%
mutate(field =case_when(
str_detect(field, "NatPhyl_Scn") ~ "Natural_and_Physical_Sciences",
str_detect(field, "InfoTech") ~ "Information_Technology",
str_detect(field, "Eng_RelTec") ~ "Engineering_and_Technologies",
str_detect(field, "ArchtBldng") ~ "Architecture_and_Building",
str_detect(field, "Ag_Envir_Rltd_Sts") ~ "Agriculture_Environment",
str_detect(field, "Health") ~ "Health",
str_detect(field, "Educ") ~ "Education",
str_detect(field, "Mgnt_Com") ~ "Management_and_Commerce",
str_detect(field, "Society_Cult") ~ "Society_and_Culture",
str_detect(field, "Creative_Arts") ~ "Creative_Arts",
str_detect(field, "Fd_Hosp_Psnl_Svcs") ~ "Food_Hospitality_and_Personal_Services",
str_detect(field, "MixFld_Prgm") ~ "Mixed_Field_Programmes",
str_detect(field, "FldStd_NS") ~ "Not Stated",
TRUE ~ field)) %>%
select(-category) %>%
rename(count_field = count)
```
```{r}
g51 <- map_dfr(census_paths[6:8], ~{
df <- read_csv(.x) %>%
select(-starts_with("P"), -contains("Tot")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}_{industry=(Ag_For_Fshg|Mining|Manufact|El_Gas_Wt_Waste|Constru|WhlesaleTde|RetTde|Accom_food|Trans_post_wrehsg|Info_media_teleco|Fin_Insur|RtnHir_REst|Pro_scien_tec|Admin_supp|Public_admin_sfty|Educ_trng|HlthCare_SocAs|Art_recn|Oth_scs|ID_NS)}_{age_min=\\d+}_{age_max=\\d+}",
"{sex=[MF]}_{industry=(Ag_For_Fshg|Mining|Manufact|El_Gas_Wt_Waste|Constru|WhlesaleTde|RetTde|Accom_food|Trans_post_wrehsg|Info_media_teleco|Fin_Insur|RtnHir_REst|Pro_scien_tec|Admin_supp|Public_admin_sfty|Educ_trng|HlthCare_SocAs|Art_recn|Oth_scs|ID_NS)}_{age_min=\\d+}ov"
),
remove = FALSE)
})
```
```{r}
g51 <- g51 %>%
mutate(industry =case_when(
str_detect(industry, "Ag_For_Fshg") ~ "Agriculture_forestry_and_fishing",
str_detect(industry, "Manufact") ~ "Manufacturing",
str_detect(industry, "El_Gas_Wt_Waste") ~ "Electricity_gas_water_and_waste_service",
str_detect(industry, "Constru") ~ "Construction",
str_detect(industry, "Ag_Envir_Rltd_Sts") ~ "Agriculture_Environment",
str_detect(industry, "WhlesaleTde") ~ "Wholesale_trade",
str_detect(industry, "RetTde") ~ "Retail_trade",
str_detect(industry, "Accom_food") ~ "Accommodation_and_food_services",
str_detect(industry, "Trans_post_wrehsg") ~ "Transport_postal_and_warehousing",
str_detect(industry, "Info_media_teleco") ~ "Information_media_and_telecommunications",
str_detect(industry, "Fin_Insur") ~ "Financial_and_insurance_services",
str_detect(industry, "RtnHir_REst") ~ "Rental_hiring_and_real_estate_services",
str_detect(industry, "Pro_scien_tec") ~ "Professional_scientific_and_technical_services",
str_detect(industry, "Admin_supp") ~ "Administrative_and_support_services",
str_detect(industry, "Public_admin_sfty") ~ "Public_administration_and_safety",
str_detect(industry, "Educ_trng") ~ "Education_and_training",
str_detect(industry, "HlthCare_SocAs") ~ "Health_care_and_social_assistance",
str_detect(industry, "Art_recn") ~ "Arts_and_recreation_services",
str_detect(industry, "Oth_scs") ~ "Other_services",
str_detect(industry, "ID_NS") ~ "Not Stated",
TRUE ~ industry)) %>%
select(-category) %>%
rename(count_industry = count)
```
```{r}
g57 <- map_dfr(census_paths[10], ~{
df <- read_csv(.x) %>%
select(-starts_with("P"), -contains("Tot")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}{age_min=\\d+}_{age_max=\\d+}_{occupation=(Managers|Professionals|TechnicTrades_Wrs|CommunPersnlSvc_W|ClericalAdminis_W|Sales_W|Mach_oper_drivers|Labourers|Occu_ID_NS|TechnicTrades_W)}",
"{sex=[MF]}{age_min=\\d+}ov_{occupation=(Managers|Professionals|TechnicTrades_Wrs|CommunPersnlSvc_W|ClericalAdminis_W|Sales_W|Mach_oper_drivers|Labourers|Occu_ID_NS|TechnicTrades_W)}",
"{sex=[MF]}{age_min=\\d+}_ov_{occupation=(Managers|Professionals|TechnicTrades_Wrs|CommunPersnlSvc_W|ClericalAdminis_W|Sales_W|Mach_oper_drivers|Labourers|Occu_ID_NS|TechnicTrades_W)}"
),
remove = FALSE)
})
```
```{r}
g57 <- g57 %>%
mutate(occupation =case_when(
str_detect(occupation, "TechnicTrades_W") ~ "Technicians_and_trades_workers",
str_detect(occupation, "TechnicTrades_Wrs") ~ "Technicians_and_trades_workers",
str_detect(occupation, "CommunPersnlSvc") ~ "Community_and_personal_service_workers",
str_detect(occupation, "ClericalAdminis_W") ~ "Clerical_and_administrative_workers",
str_detect(occupation, "Sales_W") ~ "Sales_workers",
str_detect(occupation, "Mach_oper_drivers") ~ "Machinery_operators_and_drivers",
str_detect(occupation, "Occu_ID_NS") ~ "Not Stated",
TRUE ~ occupation)) %>%
select(-category) %>%
rename(count_occupation = count)
```
```{r}
g52 <- map_dfr(census_paths[12:14], ~{
df <- read_csv(.x) %>%
select(-starts_with("P"), -contains("Tot")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}_{industry=(AgriForestFish|Min|Mnfg|EGW_WS|Cnstn|WTrade|RTrade|AccomFoodS|TransPostWhse|InfoMedTelecom|FinInsurS|RentHirREserv|ProScieTechServ|AdminSupServ|PubAdmiSafety|EducTrain|HealthCareSocA|ArtRecServ|OthServ|ID_NS)}_{hr_min=\\d+}_{hr_max=\\d+}",
"{sex=[MF]}_{industry=(AgriForestFish|Min|Mnfg|EGW_WS|Cnstn|WTrade|RTrade|AccomFoodS|TransPostWhse|InfoMedTelecom|FinInsurS|RentHirREserv|ProScieTechServ|AdminSupServ|PubAdmiSafety|EducTrain|HealthCareSocA|ArtRecServ|OthServ|ID_NS)}_{hr_min=\\d+}",
"{sex=[MF]}_{industry=(AgriForestFish|Min|Mnfg|EGW_WS|Cnstn|WTrade|RTrade|AccomFoodS|TransPostWhse|InfoMedTelecom|FinInsurS|RentHirREserv|ProScieTechServ|AdminSupServ|PubAdmiSafety|EducTrain|HealthCareSocA|ArtRecServ|OthServ|ID_NS)}_{hr_min=\\d+}over"
),
remove = FALSE)
})
```
```{r}
g52 <- g52 %>%
mutate(industry =case_when(
str_detect(industry, "AgriForestFish") ~ "Agriculture_forestry_and_fishing",
str_detect(industry, "Min") ~ "Mining",
str_detect(industry, "Mnfg") ~ "Manufacturing",
str_detect(industry, "EGW_WS") ~ "Electricity_gas_water_and_waste_service",
str_detect(industry, "Cnstn") ~ "Construction",
str_detect(industry, "WTrade") ~ "Wholesale_trade",
str_detect(industry, "RTrade") ~ "Retail_trade",
str_detect(industry, "AccomFoodS") ~ "Accommodation_and_food_services",
str_detect(industry, "TransPostWhse") ~ "Transport_postal_and_warehousing",
str_detect(industry, "InfoMedTelecom") ~ "Information_media_and_telecommunications",
str_detect(industry, "FinInsurS") ~ "Financial_and_insurance_services",
str_detect(industry, "RentHirREserv") ~ "Rental_hiring_and_real_estate_services",
str_detect(industry, "ProScieTechServ") ~ "Professional_scientific_and_technical_services",
str_detect(industry, "AdminSupServ") ~ "Administrative_and_support_services",
str_detect(industry, "PubAdmiSafety") ~ "Public_administration_and_safety",
str_detect(industry, "EducTrain") ~ "Education_and_training",
str_detect(industry, "HealthCareSocA") ~ "Health_care_and_social_assistance",
str_detect(industry, "ArtRecServ") ~ "Arts_and_recreation_services",
str_detect(industry, "OthServ") ~ "Other_services",
str_detect(industry, "ID_NS") ~ "Not Stated",
TRUE ~ industry)) %>%
select(-category) %>%
rename(count_industry = count)
```
```{r}
g58 <- map_dfr(census_paths[16], ~{
df <- read_csv(.x) %>%
select(-starts_with("P"), -contains("Tot")) %>%
mutate(SA4_CODE_2016 = as.character(SA4_CODE_2016)) %>%
pivot_longer(cols = -c(SA4_CODE_2016),
names_to = "category",
values_to = "count") %>%
unglue_unnest(category,
c("{sex=[MF]}_{occupation=(Mng|Pro|TTW|CPS|CA|Sal|MOD|Lab|ID_NS|)}_{hrs_min=\\d+}_{hrs_max=\\d+}",
"{sex=[MF]}_{occupation=(Mng|Pro|TTW|CPS|CA|Sal|MOD|Lab|ID_NS|)}_{hrs_min=\\d+}",
"{sex=[MF]}_{occupation=(Mng|Pro|TTW|CPS|CA|Sal|MOD|Lab|ID_NS|)}_{hrs_min=\\d+}over"
),
remove = FALSE)
})
```
```{r}
g58 <- g58 %>%
mutate(occupation =case_when(
# labels kept consistent with the occupation names used in g57
str_detect(occupation, "Mng") ~ "Managers",
str_detect(occupation, "Pro") ~ "Professionals",
str_detect(occupation, "TTW") ~ "Technicians_and_trades_workers",
str_detect(occupation, "CPS") ~ "Community_and_personal_service_workers",
str_detect(occupation, "CA") ~ "Clerical_and_administrative_workers",
str_detect(occupation, "Sal") ~ "Sales_workers",
str_detect(occupation, "MOD") ~ "Machinery_operators_and_drivers",
str_detect(occupation, "Lab") ~ "Labourers",
str_detect(occupation, "ID_NS") ~ "Not Stated",
TRUE ~ occupation)) %>%
select(-category) %>%
rename(count_occupation = count)
```
Population Count {.storyboard}
=========================================
### Population Map
```{r}
vicpopulation <- g51 %>%
group_by(SA4_CODE_2016) %>%
summarise(population = sum(count_industry)) %>%
ungroup()
population <- vicpopulation %>%
summarise(population=sum(population))
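# population by region and sex, widened to one column per sex
vicpopulation <- g51 %>%
  group_by(SA4_CODE_2016, sex) %>%
  summarise(population = sum(count_industry), .groups = "drop") %>%
  pivot_wider(names_from = sex,
              values_from = population) %>%
  rename(malepopulation = M,
         femalepopulation = `F`) %>%
  # attach the regional totals computed above, naming the key explicitly
  full_join(vicpopulation, by = "SA4_CODE_2016")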
```
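
The aggregation above sums counts within each region and sex, then spreads the sexes into columns. The base R sketch below shows the same reshaping on two regions; the four counts are taken from the population table for regions 201 and 202, and the `long` and `wide` names are illustrative.

```r
# Counts for regions 201 and 202 from the population table, in long form.
long <- data.frame(
  SA4_CODE_2016 = c("201", "201", "202", "202"),
  sex           = c("F", "M", "F", "M"),
  count         = c(32726, 34691, 32396, 34054)
)

# Cross-tabulate (summing counts) by region and sex, then add row totals.
wide <- as.data.frame.matrix(xtabs(count ~ SA4_CODE_2016 + sex, data = long))
wide$population <- rowSums(wide[, c("F", "M")])

print(wide)
```

The row totals agree with the `population` column of the table (67417 and 66450), which is a quick consistency check on the join.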
```{r}
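vicpopulation %>%
  # sa4_geomap keeps the shapefile key SA4_CODE16 (see the geopath chunk)
  full_join(sa4_geomap,
            by = c("SA4_CODE_2016" = "SA4_CODE16")) %>%
  ggplot() +
  geom_sf(mapping = aes(geometry = geometry, fill = population)) +
  # a fixed label colour belongs outside aes()
  geom_sf_text(aes(geometry = geometry, label = SA4_CODE_2016),
               colour = "white", check_overlap = TRUE) +
  theme_void()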
```
> Most of the Victorian population is concentrated in the Melbourne city region.
> Other regions, though large in area, have smaller populations.
### Population Table
```{r}
vicpopulation %>%
kable(caption = "Victorian Population") %>%
kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```
### Age Distribution
```{r agedistributiong57, fig.height=4}
# expand to one row per person so the density reflects population counts
g57redundantage <- g57 %>%
  uncount(count_occupation)
g57redundantage %>%
  ggplot() +
  # a fixed alpha belongs outside aes()
  geom_density(mapping = aes(x = as.numeric(age_min),
                             colour = sex),
               alpha = 0.5) +
  labs(x = "age") +
  scale_x_continuous()
```
***
- Most of the population is middle-aged, between 20 and 50 years.
- Older people form a small and vulnerable share of the population.
- The age distribution is similar for the male and female populations.
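
The age density above is drawn from aggregated counts, so each age band has to be weighted by the number of people in it. A minimal base R sketch of two ways to do that follows; the ages and counts are hypothetical, not census figures, and the fixed bandwidth of 5 is an arbitrary choice for the illustration.

```r
# Hypothetical counts per lower age bound (not census figures).
age_min <- c(15, 25, 35, 45)
count   <- c(2, 5, 4, 1)

# Option 1: repeat each age once per person.
expanded <- rep(age_min, times = count)
dens_expanded <- density(expanded, bw = 5)

# Option 2: keep one row per band and pass normalised weights.
dens_weighted <- density(age_min, weights = count / sum(count), bw = 5)

print(length(expanded))
print(all.equal(dens_expanded$y, dens_weighted$y))
```

The weighted form avoids materialising one row per person, which can matter when the counts run into the millions.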
GenderLinearModel {.hidden}
=====================================
Column
-----------------------------------------
### Occupation: Male vs Female
```{r}
g57 %>%
pivot_wider(names_from = sex,
values_from = count_occupation) %>%
ggplot(mapping = aes(x = M, y = `F`, colour = occupation)) +
geom_point() +
labs(title = "Population: Male vs Female",
x = "Male Population",
y= "Female Population") +
scale_y_continuous(labels = label_number()) +
scale_x_continuous(labels = label_number()) +
theme(legend.position = "bottom")
```
### Occupation: Male vs Female
```{r}
g57occmfnest <- g57 %>%
pivot_wider(names_from = sex,
values_from = count_occupation) %>%
select(occupation, `F`, M ) %>%
group_by(occupation) %>%
nest() %>%
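# regress the female count on the male count within each occupation
mutate(model = map(data, ~ lm(F ~ M, data = .x))) %>%
# broom provides the augment() method for lm objects
mutate(aug = map(model, broom::augment)) %>%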
unnest(aug)
mvfocc <- ggplot(g57occmfnest,
aes(x = M)) +
# index represent splitting value,
#geom_point(aes(y = `F`, colour = industry, aplha=0)) +
geom_line(aes(y = .fitted, colour = occupation)) +
geom_abline(slope = 1) +
theme(legend.position = "bottom") +
labs(title = "Population: Male vs Female",
x = "Male Population",
y= "Female Population")
# geom_text(aes(y = .fitted,label=industry, colour="white"),
# check_overlap=TRUE)
ggplotly(mvfocc) %>%
hide_legend()
```
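
The fitted lines above come from one linear model per group, compared against the reference line of slope 1. The sketch below shows that per-group fit in base R on hypothetical counts (not census figures); a slope below 1 means the female count grows more slowly than the male count across regions, and a slope above 1 means the opposite.

```r
# Hypothetical male/female counts for two occupations (not census figures).
counts <- data.frame(
  occupation = rep(c("Managers", "Professionals"), each = 4),
  M = c(10, 20, 30, 40, 10, 20, 30, 40),
  F = c( 6, 11, 16, 21, 13, 25, 37, 49)
)

# Fit F ~ M separately for each occupation and keep only the slope.
slopes <- vapply(
  split(counts, counts$occupation),
  function(d) unname(coef(lm(F ~ M, data = d))["M"]),
  numeric(1)
)

print(round(slopes, 2))
```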
Column
-----------------------------------------
### Industry: Male vs Female
```{r}
g51 %>%
pivot_wider(names_from = sex,
values_from = count_industry) %>%
ggplot(mapping = aes(x = M, y = `F`, colour = industry)) +
geom_point() +
labs(title = "Population: Male vs Female",
x = "Male Population",
y= "Female Population") +
scale_y_continuous(labels = label_number()) +
scale_x_continuous(labels = label_number()) +
theme(legend.position = "bottom")
```
### Industry: Male vs Female
```{r}
g51indmfnest <- g51 %>%
pivot_wider(names_from = sex,
values_from = count_industry) %>%
select(industry, `F`, M ) %>%
group_by(industry) %>%
nest() %>%
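# regress the female count on the male count within each industry
mutate(model = map(data, ~ lm(F ~ M, data = .x))) %>%
# broom provides the augment() method for lm objects
mutate(aug = map(model, broom::augment)) %>%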
unnest(aug)
mvf <- ggplot(g51indmfnest,
aes(x = M)) +
# index represent splitting value,
#geom_point(aes(y = `F`, colour = industry, aplha=0)) +
geom_line(aes(y = .fitted, colour = industry)) +
geom_abline(slope = 1) +
theme(legend.position = "bottom") +
labs(title = "Population: Male vs Female",
x = "Male Population",
y= "Female Population")
# geom_text(aes(y = .fitted,label=industry, colour="white"),
# check_overlap=TRUE)
ggplotly(mvf) %>%
hide_legend()
```
Population: Gender {data-navmenu="Analysis"}
=========================================
Row
-----------------------------------------
- Health care professionals form the largest group, and the ratio of men to women among them is less than one.
- In construction, by contrast, more men than women are employed as labourers.
- The number of women in the education sector far exceeds that of men.
- Management & Commerce is the most commonly studied field.
- More men than women have studied Engineering and Technology. However, more people are employed in Health Care than in industries related to Engineering.
- More women have studied Management and Commerce, yet more men are employed as managers.
- Most of the Victorian population is educated up to level 7, and most are employed as professionals.
- However, a large number of people are employed as labourers even though the share of the population with less than a high school education is very small.
- [GenderLinearModel] shows the relationship between male and female populations.
Column {.tabset}
-----------------------------------------
### Population by Education
- Most residents achieved level 7, which corresponds to a bachelor degree, and almost twice as many women as men reached it.
- The majority of male residents achieved levels 3 and 4.
```{r}
g46a %>%
  ggplot(mapping = aes(x = reorder_within(afq_level, count_edu_lvl, sex), y = count_edu_lvl, fill = sex)) +
  geom_col() +
  # strip the "___sex" suffix that reorder_within() appends to the labels
  scale_x_reordered() +
  labs(title = "Population Share of education level", y = "Number of students") +
  scale_y_continuous(labels = label_number()) +
  theme(axis.title.y = element_blank()) +
  coord_flip()
```
### Population by Industry
```{r}
g51 %>%
  ggplot(mapping = aes(x = reorder_within(industry, count_industry, sex), y = count_industry, fill = sex)) +
  geom_col() +
  # strip the "___sex" suffix that reorder_within() appends to the labels
  scale_x_reordered() +
  labs(title = "Population Share of Industries", y = "Number of Employees") +
  scale_y_continuous(labels = label_number()) +
  theme(axis.title.y = element_blank()) +
  coord_flip()
```
Column {.tabset}
-----------------------------------------
### Population by Field
```{r}
ggplot(g47, aes(x = reorder_within(field, count_field, sex),
                y = count_field,
                fill = sex)) +
  geom_col() +
  # strip the "___sex" suffix that reorder_within() appends to the labels
  scale_x_reordered() +
  labs(x = "Field",
       y = "Number of observations",
       title = "Field by gender") +
  scale_y_continuous(labels = label_number()) +
  theme(axis.title.y = element_blank()) +
  coord_flip()
```
### Population by Occupation
```{r}
g57 %>%
  ggplot(mapping = aes(x = reorder_within(occupation, count_occupation, sex), y = count_occupation, fill = sex)) +
  geom_col() +
  # strip the "___sex" suffix that reorder_within() appends to the labels
  scale_x_reordered() +
  labs(title = "Population Share of Occupation", y = "Number of Employees") +
  scale_y_continuous(labels = label_number()) +
  theme(axis.title.y = element_blank()) +
  coord_flip()
```
Population: Age {data-orientation=columns data-navmenu="Analysis"}
=========================================
Column {data-width=30%}
-----------------------------------------
- As seen from the age distribution, all sectors have people in the 25 to 45 age group.
- The 25-35 age group has the largest population in most sectors.
- A key observation is that some people aged over 75 are still working.
Column {.tabset}
-----------------------------------------
### Population by Education
```{r}
popeduage <- g46a %>%
group_by(educationlevel, age_min) %>%
summarise(count_eduage = sum(count_edu_lvl)) %>%
ungroup()
nodes <- data.frame(node = unique(popeduage$educationlevel),
category = "education level") %>%
full_join(data.frame(node = unique(popeduage$age_min),
category = "age"))
popeduage <- popeduage[,c(1,2,3,1)]
networkeduage <- graph_from_data_frame(d=popeduage,directed = TRUE, vertices = nodes)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
set.seed(122)
networkeduage %>%
ggraph(layout = "stress") +
geom_edge_link2(aes(edge_alpha = count_eduage,edge_width = count_eduage,edge_color = educationlevel),arrow = a) +
geom_node_point(aes(size = 2, colour =category) )+
geom_node_text(aes(label = name), repel = TRUE, point.padding = unit(0.15, "lines")) +
theme_void() +
theme(legend.position = "none")
```
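
The network above is built from an edge list (one row per category and age pair, with the count as an edge attribute) plus a vertex table. A small sketch of that structure with `igraph::graph_from_data_frame()` follows; the edge counts and the `edges`, `nodes` and `g` names are hypothetical and chosen only to show the shape of the inputs.

```r
library(igraph)

# Hypothetical edge list: education level -> lower age bound, with a count.
edges <- data.frame(
  from  = c("BachDeg", "BachDeg", "Cert_III_IV"),
  to    = c("25", "35", "25"),
  count = c(120, 80, 60)
)

# Every name used in `from` or `to` must appear in the first vertex column.
nodes <- data.frame(
  node     = c("BachDeg", "Cert_III_IV", "25", "35"),
  category = c("education level", "education level", "age", "age")
)

g <- graph_from_data_frame(d = edges, directed = TRUE, vertices = nodes)

print(vcount(g))   # number of vertices
print(E(g)$count)  # extra edge columns become edge attributes
```

The first two columns of the edge data frame are always read as the endpoints, which is why any column needed for styling has to sit in the third position or later.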
### Population by Industries
```{r}
popindage <- g51 %>%
group_by(industry, age_min) %>%
summarise(count_indage = sum(count_industry)) %>%
ungroup()
nodes <- data.frame(node = unique(popindage$industry),
category = "industry") %>%
full_join(data.frame(node = unique(popindage$age_min),
category = "age"))
popindage <- popindage[,c(1,2,3,1)]
networkindage <- graph_from_data_frame(d=popindage,directed = TRUE, vertices = nodes)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
set.seed(122)
networkindage %>%
ggraph(layout = "stress") +
geom_edge_link2(aes(edge_alpha = count_indage,edge_width = count_indage,edge_color = industry),arrow = a) +
geom_node_point(aes(size = 2, colour =category) )+
geom_node_text(aes(label = name), repel = TRUE, point.padding = unit(0.15, "lines")) +
theme_void() +
theme(legend.position = "none")
```
Row {.tabset}
-----------------------------------------
### Population by Field
```{r}
popfieldage <- g47 %>%
group_by(field, age_min) %>%
summarise(count_fieldage = sum(count_field)) %>%
ungroup() %>%
filter(!is.na(field))
nodes <- data.frame(node = unique(popfieldage$field),
category = "field") %>%
full_join(data.frame(node = unique(popfieldage $age_min),
category = "age"))
popfieldage <- popfieldage [,c(1,2,3,1)]
networkfieldage <- graph_from_data_frame(d= popfieldage,directed = TRUE, vertices = nodes)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
set.seed(122)
networkfieldage %>%
ggraph(layout = "stress") +
geom_edge_link2(aes(edge_alpha = count_fieldage, edge_width = count_fieldage,edge_color = field),arrow = a) +
geom_node_point(aes(size = 2, colour =category) )+
geom_node_text(aes(label = name), repel = TRUE, point.padding = unit(0.15, "lines")) +
theme_void() +
theme(legend.position = "none")
```
### Population by Occupation
```{r}
popoccage <- g57 %>%
group_by(occupation, age_min) %>%
summarise(count_occage = sum(count_occupation)) %>%
ungroup()
nodesocc <- data.frame(node = unique(popoccage$occupation),
                       category = "occupation") %>%
  # take the age nodes from the occupation data, not the industry data
  full_join(data.frame(node = unique(popoccage$age_min),
                       category = "age"))
popoccage <- popoccage[,c(1,2,3,1)]
networkoccage <- graph_from_data_frame(d=popoccage,directed = TRUE, vertices = nodesocc)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
set.seed(122)
networkoccage %>%
ggraph(layout = "stress") +
geom_edge_link2(aes(edge_alpha = count_occage,edge_width = count_occage,edge_color = occupation),arrow = a) +
geom_node_point(aes(size = 2, colour =category) )+
geom_node_text(aes(label = name), repel = TRUE, point.padding = unit(0.15, "lines")) +
theme_void() +
theme(legend.position = "none")
```
Population: Age {data-orientation=rows data-navmenu="Analysis"}
=========================================
Row {.tabset}
-----------------------------------------
### Population by Education, Age
```{r}
g46a %>%
group_by(afq_level, age_min) %>%
summarise(population=sum(count_edu_lvl)) %>%
group_by(afq_level) %>%
slice_max(population, n=1) %>%
arrange(age_min)%>%
kable(caption = "Education: Population") %>%
kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```
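
The table above keeps, for each education level, the age band with the largest population. A base R sketch of that "largest row per group" step follows. The Level 7 (age 25) and Level 8 (age 35) figures come from the education table; the other two rows are hypothetical fillers so that each level has more than one band to choose from.

```r
pop <- data.frame(
  afq_level  = c("Level 7", "Level 7", "Level 8", "Level 8"),
  age_min    = c(25, 35, 25, 35),
  # 245613 and 28908 are from the table; 200000 and 20000 are hypothetical.
  population = c(245613, 200000, 20000, 28908)
)

# For each level keep the row with the largest population, then sort by age.
top <- do.call(
  rbind,
  lapply(split(pop, pop$afq_level), function(d) d[which.max(d$population), ])
)
top <- top[order(top$age_min), ]

print(top)
```

One difference from `slice_max()` is worth noting: `which.max()` returns only the first maximum, whereas `slice_max()` keeps tied rows by default.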
### Population by Industries, Age
```{r}
g51 %>%
group_by(industry, age_min) %>%
summarise(population=sum(count_industry)) %>%
group_by(industry) %>%
slice_max(population, n=1) %>%
arrange(age_min)%>%
kable(caption = "Industry: Population") %>%
kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```
Row {.tabset}
-----------------------------------------
### Population by Field
```{r}
g47 %>%
group_by(field, age_min) %>%
summarise(population=sum(count_field)) %>%
group_by(field) %>%
slice_max(population, n=1) %>%
arrange(age_min)%>%
kable(caption = "Field: Population") %>%
kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```
### Population by Occupation
```{r}
g57%>%
group_by(occupation, age_min) %>%
summarise(population=sum(count_occupation)) %>%
group_by(occupation) %>%
slice_max(population, n=1) %>%
arrange(age_min)%>%
kable(caption = "Occupation: Population") %>%
kable_styling(bootstrap_options = c("striped", "hover"), latex_options = "hold_position")
```
Region : Sectors {data-orientation=rows data-navmenu="Regions"}
=========================================
Column
-----------------------------------------
- The bar plots show the working population of the SA4 regions by education level, field of study, industry of employment and occupation.
- Region 206 had the largest number of people at the highest education levels, which is consistent with region 206 also having the largest number of people employed as professionals in their respective industries.
- Management and commerce, and engineering and technology, were the most common fields of study, while agriculture, environment and mixed field programs had the smallest population share.
- Health care, manufacturing and retail trade were the industries employing the most people, and Professionals and Managers were the most common occupations.
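The "best region per category" logic behind these bar plots can be illustrated on a small table. This is a minimal sketch on invented data: `toy` and its counts are hypothetical, and only the `group_by()` plus `slice_max()` pattern mirrors the chunks on this page.

```r
library(dplyr)

# Hypothetical counts, invented for illustration only (not census data)
toy <- data.frame(
  SA4_CODE_2016 = c("206", "206", "212", "212"),
  field = c("Management", "Engineering", "Management", "Engineering"),
  count_field = c(120, 90, 80, 110)
)

# For each field, keep the region with the largest count
best_region <- toy %>%
  group_by(field) %>%
  slice_max(count_field, n = 1) %>%
  ungroup() %>%
  arrange(SA4_CODE_2016)

best_region
```

Grouping by `field` answers "which region leads in this field"; grouping by `SA4_CODE_2016` instead would answer "which field leads in this region", which is the question the maps page asks.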
Column
-----------------------------------------
### Education Level: Region
```{r bestedu, fig.cap="Best education level of each region"}
popareaedu <- g46a %>%
group_by(SA4_CODE_2016, afq_level) %>%
summarise(count_eduarea = sum(count_edu_lvl)) %>%
ungroup()
bestedu <- popareaedu %>%
select(1:3) %>%
group_by(afq_level) %>%
slice_max(count_eduarea) %>%
arrange(SA4_CODE_2016)
bestedu %>%
ggplot() +
geom_col(mapping = aes(x = reorder_within(afq_level,count_eduarea, SA4_CODE_2016), y = count_eduarea, fill = afq_level)) +
labs(title = "Region and Best education level",
x = "Education levels with region code",
y = "Number of Students") +
scale_y_continuous(label=label_number()) +
coord_flip() +
theme(legend.position = "none")
```
### Industry: Region
```{r popareaind, fig.cap=""}
popindarea <- g51 %>%
group_by(SA4_CODE_2016, industry) %>%
summarise(count_indarea = sum(count_industry)) %>%
ungroup()
popareaindmax <- popindarea %>%
select(1:3) %>%
group_by(industry) %>%
slice_max(count_indarea) %>%
arrange(SA4_CODE_2016)
popareaindmax %>%
ggplot() +
geom_col(mapping = aes(x = reorder_within(industry,count_indarea, SA4_CODE_2016), y = count_indarea, fill = industry)) +
labs(title = "Region and Best Industry", y = "Number of Employees") +
scale_y_continuous(label=label_number()) +
coord_flip() +
theme(legend.position = "none")
```
Column
-----------------------------------------
### Field: Region
```{r bestfield, fig.cap="Best field of each region"}
popareafield <- g47 %>%
group_by(SA4_CODE_2016, field) %>%
summarise(count_fieldarea = sum(count_field)) %>%
ungroup()
bestfield <- popareafield %>%
select(1:3) %>%
group_by(field) %>%
slice_max(count_fieldarea) %>%
arrange(SA4_CODE_2016)
bestfield %>%
ggplot() +
geom_col(mapping = aes(x = reorder_within(field,count_fieldarea, SA4_CODE_2016), y = count_fieldarea, fill = field)) +
labs(title = "Region and Best Field",
x = "Fields with region code",
y = "Number of Students") +
scale_y_continuous(label=label_number()) +
coord_flip() +
theme(legend.position = "none")
```
### Occupation: Region
```{r popareaocc, fig.cap=""}
popoccarea <- g57 %>%
group_by(SA4_CODE_2016, occupation) %>%
summarise(count_occarea = sum(count_occupation)) %>%
ungroup()
popareaoccmax <- popoccarea %>%
select(1:3) %>%
group_by(occupation) %>%
slice_max(count_occarea) %>%
arrange(SA4_CODE_2016)
popareaoccmax %>%
ggplot() +
geom_col(mapping = aes(x = reorder_within(occupation,count_occarea, SA4_CODE_2016), y = count_occarea, fill = occupation)) +
labs(title = "Region and Best Occupation", y = "Number of Employees") +
scale_y_continuous(label=label_number()) +
coord_flip() +
theme(legend.position = "none")
```
(G52 Analysis) {data-navmenu="G52"}
=============================
Row{data-width=420}
-----------------------------------------------------------------------
### Chart A
- Both figures show that, overall, females worked more than males. However, as the number of hours worked increased, males worked more than females.
```{r, include=FALSE}
p1 <- g52 %>%
mutate(hr_min = as.numeric(hr_min)) %>%
summarise(hr_min = sum(hr_min, na.rm = TRUE))
p2 <- g52 %>%
mutate(hr_max = as.numeric(hr_max)) %>%
summarise(hr_max = sum(hr_max, na.rm = TRUE))
```
```{r hr_plots, fig.show='hold', out.width="50%"}
p1 <- g52 %>%
ggplot(
mapping = aes(x = hr_min,
y = count_industry,
fill = sex)) +
geom_bar(stat = "identity",
position = "dodge") +
theme_bw() +
xlab("Minimum Hours") +
ylab("Count") +
ggtitle("Min hours worked for Industries")
p1
p2 <- g52 %>%
ggplot(
mapping = aes(x = hr_max,
y = count_industry,
fill = sex)) +
geom_bar(stat = "identity",
position = "dodge") +
theme_bw() +
xlab("Maximum Hours") +
ylab("Count") +
ggtitle("Max hours worked for Industries")
p2
```
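One way to check a comparison like this numerically is to total the counts by sex within each hours band. The following is a minimal sketch on invented data: `toy_hours` is hypothetical, its column names (`hr_min`, `sex`, `count_industry`) are assumed to mirror `g52`, and its values do not come from the census.

```r
library(dplyr)
library(tidyr)

# Hypothetical counts per hours band and sex (invented values)
toy_hours <- data.frame(
  hr_min = c(1, 1, 35, 35, 49, 49),
  sex = c("female", "male", "female", "male", "female", "male"),
  count_industry = c(300, 200, 400, 450, 100, 250)
)

# Total per band and sex, then put the sexes side by side for comparison
by_band <- toy_hours %>%
  group_by(hr_min, sex) %>%
  summarise(total = sum(count_industry), .groups = "drop") %>%
  pivot_wider(names_from = sex, values_from = total) %>%
  mutate(more_women = female > male)

by_band
```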
Row{data-height=200}
-----------------------------------------------------------------------
### Chart B
- The figure shows that industries such as health care, education and training, construction, and professional and technical services had a larger working population as working hours increased. Mining and electricity, gas and water showed a low working population irrespective of hours worked.
```{r ind_hrs}
g52redundanthrs <- g52[rep(rownames(g52), g52$count_industry), ]
hrindcount <- g52redundanthrs %>%
ggplot(mapping = aes(x = hr_min, y = industry)) +
geom_count() +
labs(title = "Population: Industries and hours", x = "Hours") +
theme(axis.title.y = element_blank())
ggplotly(hrindcount)
```
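`geom_count()` counts overlapping observations, so the chunk above first repeats each row `count_industry` times. As an alternative sketch (not the approach used in this report), `tidyr::uncount()` performs the same expansion. The data frame `toy` below is hypothetical and its values are invented.

```r
library(tidyr)

# Hypothetical aggregated rows: one row per industry and hours band
toy <- data.frame(
  industry = c("Mining", "Health care"),
  hr_min = c("1", "35"),
  count_industry = c(2, 3)
)

# Repeat each row count_industry times; .remove = FALSE keeps the count column
expanded <- uncount(toy, weights = count_industry, .remove = FALSE)

nrow(expanded)
```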
(G58 Analysis) {data-navmenu="G52"}
=============================
Row{data-width=400}
-----------------------------------------------------------------------
### Chart C
- The figure shows that, overall, females worked more than males in all occupations. For maximum hours worked, however, the numbers of men and women remained the same as the number of working hours increased.
```{r, include=FALSE}
p3 <- g58 %>%
mutate(hrs_min = as.numeric(hrs_min)) %>%
summarise(hrs_min = sum(hrs_min, na.rm = TRUE))
p4 <- g58 %>%
mutate(hrs_max = as.numeric(hrs_max)) %>%
summarise(hrs_max = sum(hrs_max, na.rm = TRUE))
```
```{r hrs_plots, fig.show='hold', out.width="50%"}
p3 <- g58 %>%
ggplot(
mapping = aes(x = hrs_min,
y = count_occupation,
fill = sex)) +
geom_bar(stat = "identity",
position = "dodge") +
theme_bw() +
xlab("Minimum Hours") +
ylab("Count") +
ggtitle("Min hours worked at Occupation")
p3
p4 <- g58 %>%
ggplot(
mapping = aes(x = hrs_max,
y = count_occupation,
fill = sex)) +
geom_bar(stat = "identity",
position = "dodge") +
theme_bw() +
xlab("Maximum Hours") +
ylab("Count") +
ggtitle("Max hours worked at Occupation")
p4
```
Row{data-height=250}
-----------------------------------------------------------------------
### Chart D
- The figure shows that most employees in the SA4 regions work as Professionals, Managers, and Technicians and trade workers. Professionals accounted for the highest number of employees, in region 206, while machinery operators and drivers accounted for the lowest number, in region 213.
```{r popareaoccupation, fig.cap=""}
popareaoccupation <- g58 %>%
group_by(SA4_CODE_2016, occupation) %>%
summarise(count_occupationarea = sum(count_occupation)) %>%
ungroup()
popareaoccupationmax <- popareaoccupation %>%
select(1:3) %>%
group_by(SA4_CODE_2016) %>%
slice_max(count_occupationarea) %>%
arrange(SA4_CODE_2016)
popareaoccupationmax %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016"="SA4_CODE")) %>%
ggplot() +
geom_sf(mapping = aes(geometry= geometry, fill=occupation)) +
geom_sf_text(aes(geometry= geometry,label=occupation), colour="white", check_overlap=TRUE)+
theme_void() +
theme(legend.position = "bottom")
bestoccupation <- popareaoccupation %>%
select(1:3) %>%
group_by(occupation) %>%
slice_max(count_occupationarea) %>%
arrange(SA4_CODE_2016)
bestoccupation %>%
ggplot() +
geom_col(mapping = aes(x = reorder_within(occupation,count_occupationarea, SA4_CODE_2016), y = count_occupationarea, fill = occupation)) +
labs(title = "Region and Best Occupation", y = "Number of Employees") +
scale_y_continuous(label=label_number()) +
coord_flip() +
theme(legend.position = "none")
```
Maps {data-orientation=column data-navmenu="Regions"}
=========================================
Column
-----------------------------------------
- The maps show the SA4 regions and the distribution of their population by education level, industry, field of study and occupation.
- Most of the population has completed education level 7, with management and commerce as the most common field of study.
- The highest numbers of people are employed as Professionals, Managers, and Technicians and trade workers.
- The major industry in the city regions is health care, while the country regions are more active in agriculture.
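Each map colours a region by its largest category. `slice_max()` keeps ties by default, so a region with two equally large categories would contribute two rows to the join with the geometry. The sketch below uses invented data (`toy` is hypothetical) and shows `with_ties = FALSE` as one possible way to force a single row per region; the report's own chunks keep the default.

```r
library(dplyr)

# Hypothetical counts; region 201 has a deliberate tie
toy <- data.frame(
  SA4_CODE_2016 = c("201", "201", "202", "202"),
  industry = c("Agriculture", "Health care", "Health care", "Retail trade"),
  count_indarea = c(50, 50, 80, 30)
)

# Default behaviour: tied rows are all returned
top_with_ties <- toy %>%
  group_by(SA4_CODE_2016) %>%
  slice_max(count_indarea) %>%
  ungroup()

# One row per region, ties broken by row order
top_single <- toy %>%
  group_by(SA4_CODE_2016) %>%
  slice_max(count_indarea, with_ties = FALSE) %>%
  ungroup()

c(with_ties = nrow(top_with_ties), single = nrow(top_single))
```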
Column
-----------------------------------------
### Education Level: Region
```{r edmap, fig.cap="Spatial Education Level Distribution"}
popareaedu <- g46a %>%
group_by(SA4_CODE_2016, afq_level) %>%
summarise(count_eduarea = sum(count_edu_lvl)) %>%
ungroup()
popareaedumax <- popareaedu %>%
select(1:3) %>%
group_by(SA4_CODE_2016) %>%
slice_max(count_eduarea) %>%
arrange(SA4_CODE_2016)
popareaedumax %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016"="SA4_CODE")) %>%
ggplot() +
geom_sf(mapping = aes(geometry= geometry, fill=afq_level)) +
geom_sf_text(aes(geometry= geometry,label=afq_level), colour="black", check_overlap=TRUE)+
theme_void() +
scale_fill_brewer() +
theme(legend.position = "bottom")
```
### Industry: Region
```{r indmap, fig.cap="Spatial Industry Distribution"}
popindarea <- g51 %>%
group_by(SA4_CODE_2016, industry) %>%
summarise(count_indarea = sum(count_industry)) %>%
ungroup()
popindareamax <- popindarea %>%
select(1:3) %>%
group_by(SA4_CODE_2016) %>%
slice_max(count_indarea)
popindareamax %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016"="SA4_CODE")) %>%
ggplot() +
geom_sf(mapping = aes(geometry= geometry, fill=industry)) +
geom_sf_text(aes(geometry= geometry,label=industry ), colour="black",check_overlap=TRUE)+
theme_void() +
scale_fill_brewer() +
theme(legend.position = "bottom")
# Major industry in the CBD is health care
# Major industry in the country regions is agriculture
```
Column
-----------------------------------------
### Field: Region
```{r fieldmap, fig.cap="Spatial Study Field Distribution"}
popareafield <- g47 %>%
group_by(SA4_CODE_2016, field) %>%
summarise(count_fieldarea = sum(count_field)) %>%
ungroup()
popareafieldmax <- popareafield %>%
select(1:3) %>%
group_by(SA4_CODE_2016) %>%
slice_max(count_fieldarea) %>%
arrange(SA4_CODE_2016)
popareafieldmax %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016"="SA4_CODE")) %>%
ggplot() +
geom_sf(mapping = aes(geometry= geometry, fill=field)) +
geom_sf_text(aes(geometry= geometry,label=field), colour="black", check_overlap=TRUE)+
theme_void() +
scale_fill_brewer() +
theme(legend.position = "bottom")
```
### Occupation: Region
```{r occmap, fig.cap="Spatial Occupation Distribution"}
popareaocc <- g57 %>%
group_by(SA4_CODE_2016, occupation) %>%
summarise(count_occarea = sum(count_occupation)) %>%
ungroup()
popoccareamax <- popareaocc %>%
select(1:3) %>%
group_by(SA4_CODE_2016) %>%
slice_max(count_occarea)
popoccareamax %>%
full_join(sa4_geomap,
by = c("SA4_CODE_2016"="SA4_CODE")) %>%
ggplot() +
geom_sf(mapping = aes(geometry= geometry, fill=occupation)) +
geom_sf_text(aes(geometry= geometry,label=occupation), colour="black", check_overlap=TRUE)+
theme_void() +
scale_fill_brewer() +
theme(legend.position = "bottom")
```
Networks {data-orientation=column data-navmenu="Regions"}
=========================================
Column
-----------------------------------------
### Education Level: Region
```{r}
popeduarea <- g46a %>%
group_by(SA4_CODE_2016, afq_level) %>%
summarise(count_eduarea = sum(count_edu_lvl)) %>%
ungroup()
nodesarea <- data.frame(node = unique(popeduarea$SA4_CODE_2016),
category = "area") %>%
full_join(data.frame(node = unique(popeduarea$afq_level),
category = "afq_level"))
popeduarea <- popeduarea[,c(1,2,3,2)]
networkeduarea <- graph_from_data_frame(d=popeduarea,directed = TRUE, vertices = nodesarea)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
networkeduarea %>%
ggraph(layout = "stress") +
geom_edge_link2(aes(edge_alpha = count_eduarea,edge_width = count_eduarea,edge_color = afq_level),arrow = a) +
geom_node_point(aes(colour = category), size = 2) +
geom_node_text(aes(label = name), repel = TRUE, point.padding = unit(0.15, "lines")) +
theme_void() +
theme(legend.position = "none")
```
### Industry: Region
```{r }
popindarea <- g51 %>%
group_by(SA4_CODE_2016, industry) %>%
summarise(count_indarea = sum(count_industry)) %>%
ungroup()
nodesarea <- data.frame(node = unique(popindarea$SA4_CODE_2016),
category = "area") %>%
full_join(data.frame(node = unique(popindarea$industry),
category = "industry"))
popindarea <- popindarea[,c(1,2,3,2)]
networkindarea <- graph_from_data_frame(d=popindarea,directed = TRUE, vertices = nodesarea)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
networkindarea %>%
ggraph(layout = "stress") +
geom_edge_link2(aes(edge_alpha = count_indarea,edge_width = count_indarea,edge_color = industry),arrow = a) +
geom_node_point(aes(colour = category), size = 2) +
geom_node_text(aes(label = name), repel = TRUE, point.padding = unit(0.15, "lines")) +
theme_void() +
theme(legend.position = "none")
```
Column
-----------------------------------------
### Field: Region
```{r}
popfieldarea <- g47 %>%
group_by(SA4_CODE_2016, field) %>%
summarise(count_fieldarea = sum(count_field)) %>%
ungroup()
nodesarea <- data.frame(node = unique(popfieldarea$SA4_CODE_2016),
category = "area") %>%
full_join(data.frame(node = unique(popfieldarea$field),
category = "field"))
popfieldarea <- popfieldarea[,c(1,2,3,2)]
networkfieldarea <- graph_from_data_frame(d=popfieldarea,directed = TRUE, vertices = nodesarea)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
networkfieldarea %>%
ggraph(layout = "stress") +
geom_edge_link2(aes(edge_alpha = count_fieldarea,edge_width = count_fieldarea,edge_color = field),arrow = a) +
geom_node_point(aes(colour = category), size = 2) +
geom_node_text(aes(label = name), repel = TRUE, point.padding = unit(0.15, "lines")) +
theme_void() +
theme(legend.position = "none")
```
### Occupation: Region
```{r }
popareaocc <- g57 %>%
group_by(SA4_CODE_2016, occupation) %>%
summarise(count_occarea = sum(count_occupation)) %>%
ungroup()
nodesareaocc <- data.frame(node = unique(popareaocc$SA4_CODE_2016),
category = "area") %>%
full_join(data.frame(node = unique(popareaocc$occupation),
category = "occupation"))
popareaocc <- popareaocc[,c(1,2,3,2)]
networkoccarea <- graph_from_data_frame(d=popareaocc,directed = TRUE, vertices = nodesareaocc)
a <- grid::arrow(type = "closed", length = unit(0.2,"inches"))
networkoccarea %>%
ggraph(layout = "stress") +
geom_edge_link2(aes(edge_alpha = count_occarea,edge_width = count_occarea,edge_color = occupation),arrow = a) +
geom_node_point(aes(colour = category), size = 2) +
geom_node_text(aes(label = name), repel = TRUE, point.padding = unit(0.15, "lines")) +
theme_void() +
theme(legend.position = "none")
```
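In each network chunk the summarised table is reindexed with `[, c(1, 2, 3, 2)]` before it is passed to `graph_from_data_frame()`. The likely reason is that igraph consumes the first two columns as the edge endpoints and turns only the remaining columns into edge attributes, so the category column is duplicated to keep it available for `edge_color`. The sketch below shows this on invented data; the names `edges`, `nodes` and `occupation_label` are hypothetical, and the extra column is given an explicit name rather than a duplicated one.

```r
library(igraph)

# Hypothetical edge list: region -> occupation, weighted by an invented count
edges <- data.frame(
  SA4_CODE_2016 = c("206", "206", "212"),
  occupation = c("Professionals", "Managers", "Professionals"),
  count_occarea = c(120, 60, 90),
  stringsAsFactors = FALSE
)

# Copy the category so it survives as an edge attribute
edges$occupation_label <- edges$occupation

# Vertex table: the first column must contain every name used in the edge list
nodes <- data.frame(
  node = c(unique(edges$SA4_CODE_2016), unique(edges$occupation)),
  category = c(rep("area", 2), rep("occupation", 2)),
  stringsAsFactors = FALSE
)

g <- graph_from_data_frame(d = edges, directed = TRUE, vertices = nodes)

# Only the columns after the first two appear as edge attributes
edge_attr_names(g)
```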
Conclusion {data-orientation=column}
=========================================
Column
-----------------------------------------------------------------------
### Conclusion
Education level, field of study, industry of employment and occupation were studied for the Victorian population at SA4 level, with the distributions broken down by age and sex. The tables and plots were compared to identify covariation between these distributions, for example between field of study and industry of employment. Networks weighted by population were drawn to analyse these trends. Some trends were particularly interesting, such as more men being employed as managers even though more women had studied management. Choropleth maps were made to analyse these trends spatially.

The goal of this report is to build a data story from these statistical summaries, setting out the facts in the data and linking them to the real world. The data provided by the Australian Bureau of Statistics is aggregated open data and in no way identifies individuals who took part in the census. The ABS aims to integrate census data with other datasets to make it more interesting. We aim to do the same and to add further data stories as we continue building this report.
References {data-orientation=column}
=========================================
### Data Sources
- [Australian Bureau of Statistics 2016](https://www.abs.gov.au/websitedbs/censushome.nsf/home/2016)
- Australian Bureau of Statistics (2016) 'Census GeoPackages', [GeoPackages](https://datapacks.censusdata.abs.gov.au/geopackages/), accessed May 2021.
- Australian Bureau of Statistics (2016) 'Census DataPacks', [Census DataPacks](https://datapacks.censusdata.abs.gov.au/datapacks/), accessed May 2021.
- Australian Bureau of Statistics (2016) 'Census DataPacks', [Census DataPacks](https://www.abs.gov.au/ausstats/abs@.nsf/Lookup/by%20Subject/2011.0.55.001~2016~Main%20Features~DataPacks~5), accessed May 2021.
- Australian Bureau of Statistics [(2016)](https://www.abs.gov.au/ausstats/abs@.nsf/Lookup/by%20Subject/2900.0~2016~Main%20Features~Understanding%20the%20Census%20and%20Census%20Data~1)
- Australian Statistical Geography Standard [(ASGS)](https://www.abs.gov.au/websitedbs/D3310114.nsf/home/Australian+Statistical+Geography+Standard+(ASGS))